
@spike-zhu
Collaborator

@spike-zhu spike-zhu commented Feb 11, 2026

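  1. Add a warmup option to the inference script (off by default); enable it with --warmup.
  2. In the MLP, the platform is determined from the device of the input Tensor. On the Moore platform, the original elementwise swiglu is replaced with a muDNN-based silu_and_mul to improve performance on that platform. Other platforms are not affected.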
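
For readers unfamiliar with the operation in item 2, here is a minimal CPU reference of what a silu_and_mul step computes, assuming the common SwiGLU convention out = silu(gate) * up. The function name `silu_and_mul_reference` and the use of `std::vector<float>` are illustrative assumptions; this is not the muDNN kernel or the InfiniCore API, only a sketch of the math that both the elementwise and the fused paths are expected to agree on.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// SiLU (also called swish): x * sigmoid(x), written as x / (1 + e^-x).
inline float silu(float x) {
    return x / (1.0f + std::exp(-x));
}

// Hypothetical reference helper (not part of muDNN or InfiniCore).
// Computes out[i] = silu(gate[i]) * up[i] for equally sized inputs.
// A fused silu_and_mul kernel would produce the same values in one pass,
// avoiding a separate intermediate tensor for silu(gate).
std::vector<float> silu_and_mul_reference(const std::vector<float> &gate,
                                          const std::vector<float> &up) {
    assert(gate.size() == up.size());
    std::vector<float> out(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        out[i] = silu(gate[i]) * up[i];
    }
    return out;
}
```

A reference like this can be handy for checking that a platform-specific kernel matches the original elementwise path within floating-point tolerance.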

@spike-zhu spike-zhu requested review from a team and wooway777 February 11, 2026 13:32
@spike-zhu spike-zhu self-assigned this Feb 11, 2026
auto hidden_states_mutable = hidden_states;
auto [gate, up] = gate_up_proj_->forward_split(hidden_states_mutable);
infinicore::Device::Type dev_type = hidden_states->device().getType();
if(dev_type == infinicore::Device::Type::MOORE){
Contributor

I think Device-related checks belong in InfiniCore and should not be placed in the inference framework layer.
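
One way to read this suggestion, sketched under assumptions: the core library owns a single selection function that inspects the device type, and the framework's MLP calls one operator without branching on the platform. The names `DeviceType`, `SwigluBackend`, and `select_swiglu_backend` below are hypothetical stand-ins and do not reflect the actual InfiniCore API.

```cpp
#include <cassert>

// Hypothetical stand-in for the library's device enum (not infinicore::Device::Type).
enum class DeviceType { CPU, NVIDIA, MOORE };

// Hypothetical tag for which implementation the core library would run.
enum class SwigluBackend { Elementwise, FusedSiluAndMul };

// Core-library side: the only place that looks at the device type.
// Moore devices get the fused kernel; every other device keeps the
// original elementwise path, so other platforms are unaffected.
inline SwigluBackend select_swiglu_backend(DeviceType dev) {
    switch (dev) {
    case DeviceType::MOORE:
        return SwigluBackend::FusedSiluAndMul;
    default:
        return SwigluBackend::Elementwise;
    }
}
```

With a layout like this, the MLP forward pass would call a single swiglu-style operator, and adding or changing a platform-specific kernel would touch only the core library.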


2 participants